unseen environment
Appendix: Language and Visual Entity Relationship Graph for Agent Navigation
Directional features. As in previous work [3, 6, 10], we apply a 128-dimensional directional encoding by replicating (cos θ_i, sin θ_i, cos φ_i, sin φ_i) 32 times to represent the orientation of each single-view i with respect to the agent's current orientation, where θ_i and φ_i are the heading and elevation angles to that single-view. Replicating the encoding 32 times does not enrich its information but makes its gradient 32 times larger during back-propagation. We suspect that this helps the agent learn about the action-related terms (e.g.
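The directional encoding described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `directional_encoding`, the parameter names, and the choice to tile the four values as a repeating block (rather than repeating each value in place) are assumptions, since the text only states that the 4-tuple is replicated 32 times.

```python
import math

def directional_encoding(heading, elevation, repeats=32):
    """Build a directional feature from relative heading/elevation (radians).

    Hypothetical helper: the angles are assumed to be already expressed
    relative to the agent's current orientation.
    """
    # The 4 informative values: (cos θ, sin θ, cos φ, sin φ).
    base = [
        math.cos(heading),
        math.sin(heading),
        math.cos(elevation),
        math.sin(elevation),
    ]
    # Replicate the 4-tuple; 4 * 32 = 128 dimensions by default.
    # Block-wise tiling is an assumption, the source does not fix the layout.
    return base * repeats

enc = directional_encoding(math.pi / 2, 0.0)
print(len(enc))  # 128
```

Every group of four entries carries the same values, so the replication adds no information; it only increases how many input dimensions carry the directional signal.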
Counterfactual Vision-and-Language Navigation: Unravelling the Unseen
A prominent challenge is to train an agent capable of generalising to new environments at test time, rather than one that simply memorises trajectories and visual details observed during training. We propose a new learning strategy that learns both from observations and generated counterfactual environments.